ABSTRACT

Plant-bacteria interactions result from reciprocal recognition between both species. These interactions are 
responsible for essential biological processes in plant development and health status. Here, we present a 
review of the methodologies applied to investigate shifts in bacterial communities associated with plants. 
A description of techniques is made from initial isolations to culture-independent approaches focusing on 
quantitative Polymerase Chain Reaction in real time (qPCR), Denaturing Gradient Gel Electrophoresis 
(DGGE), clone library construction and analysis, the application of multivariate analyses to microbial 
ecology data and the upcoming high throughput methodologies such as microarrays and pyrosequencing. 
This review supplies information about the development of traditional methods and a general overview 
about the new insights into bacterial communities associated with plants. 

Key words: Plant-bacteria interactions, molecular techniques, multivariate analysis, endophytes, 
rhizosphere 



INTRODUCTION 

In nature, bacteria are mainly found in association with 
different species, composing bacterial communities. These 
communities occupy all terrestrial niches, colonizing 
environments such as soil, water, air, plants and animals. 

In plants, bacterial communities are associated with
different tissues, on the surfaces of leaves and roots or as
endophytes in inner parts. These bacteria are active in
processes of plant
development, nutrient supply, plant growth promotion and 
protection against pathogens. The present review explores 
aspects of assessing shifts in bacterial communities to 
monitor environmental changes. Different methodologies are 
used in such evaluations, based on cultivation or on direct 
assessment of nucleic acids extracted from environmental 
samples. Additionally, the recent application of multivariate
analysis to data from both techniques has allowed better
comprehension of factors determining the composition of 
bacterial communities, and is highly useful to monitor shifts 
caused by changes in environmental conditions. Although 
none of the available techniques completely represents the 
real environment, the development of these methodologies 
will provide more accurate information. 

Bacterial diversity 

Microorganisms are a great source of genetic diversity, 
still far from completely known and explored (49). Bacteria 
are an important portion of this diversity, representing one of 
the three domains in the phylogenetic tree (Archaea, Bacteria 
and Eucarya) (78). 

The bacterial group has a long evolutionary history, 
conferring the capacity to inhabit most terrestrial niches. 
Bacteria are the main portion of biomass on Earth, and are 
responsible for some essential processes for life such as 
cycling of carbon, nitrogen and sulfur. In addition to
bacterial species diversity, there is intra-specific diversity.
The genome of a bacterial species is characterized by the total
set of genes found in its strains, which can be divided into
two groups: (i) the core, composed of genes that are found in
at least 95% of strains and are essential for the cell's life
cycle; and (ii) the auxiliary group, found in at most 5% of
strains and responsible for adaptation of the species to
different environments (31). The core is maintained in species
by speciation and vertical transmission, while the auxiliary
group does not identify the species, since it differs from
strain to strain. The auxiliary genes are also transmitted from
strain to strain, and even between species, by horizontal gene
transfer (6).
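
The core/auxiliary split described above can be illustrated with a minimal sketch. It assumes a hypothetical input format (a dictionary mapping each strain to the set of genes it carries) and uses the 95% and 5% thresholds from the text; the function name and the "intermediate" label for genes that fall between the two thresholds are illustrative choices, not part of the cited work.

```python
def classify_genes(strain_genes, core_min=0.95, auxiliary_max=0.05):
    """Label each gene by the fraction of strains that carry it.

    strain_genes: dict mapping strain name -> iterable of gene names
    (hypothetical input format chosen for this sketch).
    """
    n_strains = len(strain_genes)
    counts = {}
    for genes in strain_genes.values():
        for gene in set(genes):
            counts[gene] = counts.get(gene, 0) + 1

    labels = {}
    for gene, count in counts.items():
        frequency = count / n_strains
        if frequency >= core_min:
            labels[gene] = "core"
        elif frequency <= auxiliary_max:
            labels[gene] = "auxiliary"
        else:
            # Not covered by either definition in the text.
            labels[gene] = "intermediate"
    return labels


# Toy data: 20 strains; geneA in all, geneB in half, geneC in one strain.
strains = {f"strain{i}": {"geneA"} for i in range(20)}
for i in range(10):
    strains[f"strain{i}"].add("geneB")
strains["strain0"].add("geneC")

labels = classify_genes(strains)
print(labels)
```

In practice the presence of a gene in a strain would come from genome comparison, which is outside the scope of this sketch.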

This concept shows that bacterial diversity is not static.
The high reproductive capacity of bacteria, associated with
short life cycles and high cell multiplication rates, leads to
high adaptive value and fast responses to environmental
change (1, 31).

The dynamics of bacterial communities colonizing plants

A wide diversity of bacteria can interact with plants, 
composing bacterial communities with important roles in 
plant development and health status (22). These interactions 
can vary according to the host plant in a process similar to 
those widely known for pathogenic microorganisms (65). 

Bacterial populations are distributed in the rhizosphere, 
epiphytic and endophytic communities. The rhizosphere is 
commonly described as the soil portion directly influenced by 
root exudates; however, an updated definition of rhizosphere 
considers it as the soil compartment influenced by the root, 
including the root itself (25). Epiphytic and endophytic 
bacteria are characterized by the colonization of surface and 
inner tissues of plants, respectively. There is an ongoing
discussion toward a better definition of these
microorganisms; a commonly used definition of endophytes
is bacteria that can be isolated from surface-disinfected plant
tissues (22). In addition to these definitions, endophytes can
be separated according to how essential the plant niche is to
them. In that case, the endophytic community is
divided into "passenger" endophytes, i.e. bacteria that 
eventually invade internal plant tissues by stochastic events 
and "true" endophytes, those with adaptive traits enabling 
them to strictly live in association with the plant (24). Due to 
the novelty of this separation, and the problems involved in 
the methodological separation of these endophytic groups, we 
will consider in this review that the endophytic community is 
those bacteria that colonize inner tissues of healthy plants. 

The composition of the rhizosphere, plant-surface and
endophytic communities is variable. A superficial analysis of
these communities could lead to the conclusion that there is a strict
specificity for niche colonization. However, a more realistic 
scene is represented by the gradient of population distribution 
along plants. If a didactic approach is applied to explain 
bacterial communities associated with plants, it would divide 
these bacteria into distinct communities, with separation 
between epiphytic and endophytic communities in 
accordance with plant organs, such as roots, stems and 
leaves. However, in nature the gradient of distribution will 
prevail over separation. It is important to note that bacteria in 
the rhizosphere are often similar to those in the endophytic 
community and on leaf surfaces. 

This wide distribution is driven by plant development 
that carries bacteria over the plant tissues (48). Chi et al. (9) 
demonstrated that similar bacteria were distributed over the 
rice plant, from roots to leaves. However, the abundance of 
bacterial types along the different niches can differ, mainly 
due to differences in these niches in nutrient supply, 
atmospheric conditions and competitiveness with other 
components of these communities (52). The behavior of these
populations and the way they colonize plants are determined
by environmental conditions. For example, the formation of
biofilms helps bacteria attach to cell walls, avoiding the
migration driven by sieve transport. Similarly, in the
parenchymatic region, living as single cells can enable better
contact with plant cells and thus a better nutritional supply
for the bacterium.

Methodologies to assess shifts in bacterial communities 
associated with plants 

A number of methodologies are available in the literature
to assess bacterial communities and to compare their
composition across niches (Figure 1); however, all have
advantages and drawbacks. Prior establishment of the goals of
any work is essential to determine the most suitable technique
to answer experimental questions. Below is a description of common
techniques used to assess shifts in microbial communities 
associated with plants. 

Isolation and culturability of bacterial communities 

Isolation in culture media is the most common
methodology to assess bacterial communities from different
environments, mainly due to its simple application. However,
this is a limited analysis of bacterial diversity (Figure 1),
influenced by a number of factors, often under- or
overestimating bacterial diversity (53).

The high diversity of bacteria in most environmental 
samples is hard to represent on a culture plate, since the vast 
majority of bacterial species do not grow on standard 
isolation media (75). It has been demonstrated that isolates
obtained by plating do not represent their natural habitat,
because the isolation methods applied only access a small
subset of the total microbial community in the environment
(15, 16, 75). Attempts to improve cultivation are based on
mimicking the environment that the bacteria come from (23). For
example, the supplementation of culture media with soil 
extracts can result in higher diversity of culturable species 
(23). Using a similar approach, recalcitrant and undescribed 
species related to Verrucomicrobium and Acidobacteria, 
which are both rather unexplored groups, were obtained (30). 
Also, changing incubation conditions of plates can stimulate 
the culturability of bacteria, e.g. changing atmospheric 
composition and addition of specific nutrients and signaling 
molecules (73). More specifically, addition of compounds that
relieve oxidative stress, like pyruvate and catalase, to culture
media can lead to higher recovery of cells from soil and water
(4, 5, 76).

Although these attempts improved the culturability of
bacterial diversity in distinct environments, it is still
essential to use polyphasic approaches for satisfactory
evaluation of bacterial communities interacting with plants.

Figure 1. Scheme showing the possibilities for application of molecular microbiology techniques to study bacterial
communities associated with plants. In black the techniques presented and discussed in this review.

Culture-independent methodologies to assess bacterial 
communities associated with plants 

The application of techniques based on analysis of 
nucleic acids (DNA or RNA) directly extracted from 
environmental samples is essential in microbial diversity 
studies; they can supply information in a culture-independent 
way, and exclude the limitations and bias arising from the
low culturable portion of bacteria in these communities (51)
(Figure 1). However, it should be noted that culture-
independent techniques also have bias, mainly introduced
during DNA extraction and amplification of target genes (3,
7).

DNA extraction is the first step to analyze bacterial 
communities in a culture-independent way, and is crucial for 
the outcome of any further molecular analysis. The DNA 
extraction strategy should enable yields representative of the 
indigenous community as well as enough purity and integrity 
for polymerase chain reaction (PCR) amplification. There is a 
wide variety of methodologies for retrieving
DNA from samples; most were first developed for soil
community analysis, but can also be applied to plant-
associated bacteria. These methodologies are divided into
two main groups: direct and indirect extraction methods
(described and compared in detail in a book chapter (43)).
Briefly, indirect methods separate bacterial cells from soil or
plant tissues by centrifugation, and the cells are then
chemically lysed to release DNA. In contrast, the direct
methodologies, commonly found in kits, do not separate cells 
from samples but process the entire sample. In this case the 
cell lysis is mechanical. Indirect methodologies result in a 
lower amount but more pure DNA, while direct 
methodologies result in a high amount of DNA, but possibly 
have environmental contaminants like humic acids. The final 
concern about methodologies is the representativeness of 
extracted DNA; direct extraction techniques have an 
advantage in this regard as the attachment of specific bacteria 
groups to soil particles or plant tissues is overcome (43). 

Considering the bias during DNA amplification, most
techniques applied to microbial communities are based on
differences in the ribosomal genes. The 16S rRNA gene is the
most widely used marker in microbial ecology and bacterial
phylogeny (37). However, the so-called universal primers are
designed from sequences that are already described, which
could limit the assessment of unknown sequences and,
consequently, fail to amplify new bacterial groups. This primer
selectivity was recently demonstrated in a study where
selection was driven by the use of different reverse primers (7).

Although these concerns are known and considered,
bacterial diversity is assessed in different environments based
on the 16S rRNA gene, revealing the portion of the
community not considered when culture-dependent
approaches are applied (27, 49). For example, the bacterial
communities in leaves from Atlantic forest trees were
described using cloning and sequencing, which allowed the
estimation of 400 phylotypes in each tree species (34). It is
also important to show the wide applicability of these 
techniques, from the quantification of target groups and 
species by real time PCR to the fingerprinting of 
communities by Denaturing Gradient Gel Electrophoresis 
(DGGE; Figure 1). A refined analysis can also be considered 
by combining a culture-independent technique (e.g. DGGE 
fingerprinting) with multivariate analysis. These applications 
are discussed below with remarks on the possibilities for use 
in microbial communities associated with plants. 

Quantitative PCR in real time (qPCR) 

PCR was a milestone in the advent of molecular biology,
and has had great impact on techniques in microbial ecology.
More recently, quantitative real time PCR (qPCR) was
incorporated into the toolbox used in studies of bacterial
communities associated with plants.

The qPCR is a highly sensitive tool to quantify microbial 
populations within a sample, since it is based on detecting 
specific sequences of nucleic acids, and estimating the 
amount in a sample (33). Briefly, the amplification of the
sequences is monitored through fluorescent markers present
in the reaction. These markers respond to amplification,
increasing the fluorescence emission after every amplification
cycle, which is detected by the PCR machine. The emitted
fluorescence is quantified, and the cycle at which it crosses a
fixed threshold value is recorded as the threshold cycle (Ct).
The Ct value of a sample is then interpolated on a standard
curve of Ct values obtained with known concentrations of
target DNA (33).
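
The interpolation on a standard curve can be sketched as follows. This is a minimal illustration, assuming the usual linear relation between Ct and the logarithm of the starting quantity, Ct = slope x log10(quantity) + intercept; the function names and the standard values are invented for the example and do not come from the cited studies.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct against log10(starting quantity)."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Interpolate the starting quantity of an unknown sample."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency implied by the slope (1.0 means doubling every cycle)."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical ten-fold dilution series of a standard (copies per reaction).
standards = [1e3, 1e4, 1e5, 1e6, 1e7]
standard_ct = [30.0, 26.7, 23.4, 20.1, 16.8]

slope, intercept = fit_standard_curve(standards, standard_ct)
unknown = quantity_from_ct(25.0, slope, intercept)
print(slope, intercept, unknown, amplification_efficiency(slope))
```

A slope near -3.32 corresponds to an efficiency near 100%; standards whose slope departs strongly from this value usually indicate a problem with the assay rather than with the arithmetic.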

The most common fluorescent agent is SYBR Green I,
which binds to double-stranded DNA and, when excited at
about 494 nm, emits fluorescence at about 521 nm.
Primer specificity is crucial when relying on SYBR Green
detection, as lack of specificity may yield non-target products
(in addition to the target) whose detection will distort the
quantification. Another widely used technique for qPCR
analysis uses the so-called TaqMan probes, which are based
on a third oligonucleotide that anneals to the target between
the two primers. This probe carries two molecules, the reporter
and the quencher; while they are close to each other, the
quencher represses the fluorescence of the reporter. During
amplification the polymerase degrades the annealed probe,
separating the reporter from the quencher, so that the reporter
fluoresces and the signal intensity is directly linked to the
amount of target DNA amplified. In addition to these two
techniques for qPCR, there are other variations in available
methodologies, such as LUX primers and molecular beacon
probes; Zhang and Fang (79) have reviewed these techniques.

Applying the qPCR technique, bacterial species were 
quantified in association with plants. The pathogen Xylella 
fastidiosa was quantified in citrus samples (45) while the 
endophytic bacterium Methylobacterium mesophilicum was 
targeted by qPCR quantification during its colonization of the 
model plant Catharanthus roseus (32). In Brassica oleracea, 
the population of the growth-promoting bacteria 
Enterobacter radicincitans was monitored by qPCR 
associated with fluorescent in situ hybridization (61), 
determining not only the amount of bacteria colonizing plants 
but also their localization in the host plant. Besides direct
quantification of microbial populations, qPCR can quantify
microbial gene expression; it has been used in this way in
important studies, for example to compare the
ammonia-oxidizing activities of the Bacteria and
Archaea domains in soils (35).

Fingerprinting techniques of bacterial communities 

Fingerprinting is the most common culture-independent 
approach for assessing the structure of bacterial communities. 
It gives an overview of the most abundant members in each
sample, evidenced by patterns of bands or peaks. Amongst
the techniques for fingerprinting bacterial communities, the
most widely used are terminal restriction fragment length
polymorphism (T-RFLP), single-strand conformation
polymorphism (SSCP), automated rRNA intergenic spacer
analysis (ARISA), temperature gradient gel electrophoresis
(TGGE) and DGGE. These techniques were first developed
and applied to soil microbiology and reviewed by Ranjard et 
al. (51) and Oros-Sichler et al. (46). The present review will 
focus on DGGE; however, some other fingerprinting methods 
will be briefly described. 

Analysis by T-RFLP is based on differences in the position
of restriction sites among bacteria of different taxonomic
affiliations. T-RFLP combines PCR using a fluorescently
labeled forward primer with restriction digestion and
electrophoresis of the amplicons, so that only the labeled
terminal fragments are detected. Thus a bacterial community
is converted into a diagram of peaks, where each signal is
related to a distinct bacterial taxon (36).

SSCP is a fingerprinting methodology based on 
differential migration of secondary structures formed by 
single-strand DNA fragments (67). After denaturation of
double-strand DNA obtained by PCR amplification from
environmental samples, the single strands form secondary
structures and are electrophoresed in a non-denaturing gel,
producing a band pattern that reflects the composition of
bacterial communities. In a recent study, Smalla et al. (70)
compared the resolving power of DGGE, T-RFLP and SSCP;
the three methodologies gave similar distinction of samples.
Thus the choice of technique can be based on the equipment
available in each laboratory.

Denaturing Gradient Gel Electrophoresis (DGGE) 

DGGE is used in descriptions and comparisons of 
bacterial communities in different environments (51). A 
DGGE gel separates amplicons with similar numbers of
nucleotides, based on differences in composition (the GC
percentage) (41). This technique has a high resolution and
permits rapid processing of many samples (42). 

In the beginning, the comparisons among communities 
were based on amplicons obtained with so-called universal 
primers, e.g. for the domain Bacteria (29, 41). Later, specific 
primers targeting microbial groups were developed, resulting 
in better assessment of some populations and better band- 
pattern definition for complex communities, such as soil and 
rhizosphere communities. Some of these specific primers
target large bacterial groups like the classes
Alphaproteobacteria and Betaproteobacteria (21) or groups
like Actinobacteria (29). More specifically, primers were also
developed to assess restricted populations, e.g. the genera
Pseudomonas (19, 40, 77), Paenibacillus (12) and
Burkholderia (63). A remark should
be made about using DGGE to assess bacterial communities
in plants, especially endophytic communities. Due to the
prokaryotic origin of chloroplasts, they harbor copies of the
16S rRNA gene, which compete for amplification with
bacterial DNA. This was originally a problem when
fingerprinting endophytic communities, because the high
amount of chloroplast DNA limits the detection of
bacteria in these samples. However, a first PCR with
primers 799F (8) and 1492R, prior to the amplification with
the GC-clamped primer, enriches the bacterial amplicons. All
these primers, universal and specific, are based on the
sequence of the 16S rRNA gene (Table 1).



Table 1. Available primers to assess specific bacterial communities based on phylogenetic or functional genes in
environmental samples.

Target                  Primer      Sequence (5'->3')                 Reference
----------------------  ----------  --------------------------------  ---------
16S rRNA                968F*       AACGCGAAGAACCTTAC                 (29)
                        R1387       CGGTGTGTACAAGGCCCGGGAACG          (29)
Bacterial 16S rRNA      1492R       TACGGYTACCTTGTTACGACT             (54)
                        799F        AACMGGATTAGATACCCKG               (8)
Alphaproteobacteria     Alpha-U     CCGCATACGCCCTACGGGGGAAAGATTTAT    (21)
Betaproteobacteria      Beta-2      CGCACAAGCGGTGGATGA                (21)
Actinobacteria          F243        GGATGAGCCCGCGGCCTA                (29)
Pseudomonas spp.        PsF         GGTCTGAGAGGATGATCAGT              (19)
Pseudomonas spp.        PsR         TTAGCTCCACCTCGCGGC
Pseudomonas spp.        F311PS      CTGGTCTGAGAGAGGATGATCAGT          (40)
Pseudomonas spp.        R1459PS     AATCACTCCGTGGTAACC
Paenibacillus spp.      PAEN515F    GCTCGGAGAGTGACGGTACCTGAGA         (12)
Burkholderia spp.       BurkR       TGCCATACTCTAGCYYGC                (63)
                        Burk3*      CTGCGAAAGCCGGAT
rpoB                    1698F*      AACATCGGTTTGATCAAC                (13, 47)
                        2041R       CGTTGCATGTTGGTACCCAT
nifH                    NHA1*       TCCACTCGTCTGATCCTG                (58)
                        NHA2        CTCGCGGATTGGCATTGCG               (58)
mxaF                    mxaF1001*   GCGGCACCAACTGGGGCTGGT             (17, 39)
                        mxaR1557    GGGCAGCATGAAGGGCTCCC
amoA                    AmoA1F*     GGGGTTTCTACTGGTGGT                (59)
                        AmoA2R      CCCCTCKGSAAAGCCTTCTTC
gacA Pseudomonas spp.   gacA1-F*    ATTAGGGTGCTAGTGGTCGA              (10)
                        gacA2-R     GGTTTTCGGTGACAGGCA

*Forward primers, to which the GC clamp was added at the 5' end.
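
Several primers in Table 1 contain degenerate positions (Y, M, K, S), so each of them stands for a small set of oligonucleotides. As a minimal sketch, the standard IUPAC nucleotide codes can be used to list the sequences covered by such a primer; the function name is illustrative, and the primer 799F is written here without the spacing of the table.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """Return every non-degenerate sequence covered by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(bases) for bases in product(*pools)]


# Primer 799F from Table 1 has two degenerate positions (M and K).
variants = expand_primer("AACMGGATTAGATACCCKG")
for sequence in variants:
    print(sequence)
```

The number of variants grows multiplicatively with each degenerate position, which is one reason why highly degenerate primers can amplify less specifically.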
Primers based on other genes have also been developed.
The first alternative gene targeted in a DGGE analysis was
rpoB, which encodes the beta subunit of RNA polymerase
(47). Other functional genes exploited for DGGE analysis are
nifH, present in nitrogen-fixing bacteria (58), and mxaF,
present in Methylobacterium spp. and other methylotrophic
bacteria (17, 28). More recently DGGE was used to screen
ammonia-oxidizing bacteria and archaea (18, 59, 62) in
association with clone sequencing. The advantage of analysis
based on functional genes is the possibility of answering not
only who is there, but also how diverse the community active
in a given process is in that environment.

Considering the many primers available (Table 1) and 
the capacity for fast processing of many samples, DGGE is 
an important approach to study bacterial communities,
supplying information about shifts in composition.

Ordination methods of multivariate analysis applied to 
DGGE fingerprints 

The increasing application of DGGE in determining the 
composition of bacterial communities and the shifts caused 
by environmental changes revealed that simple visual 
analysis of DGGE fingerprints is not sufficient to explore the 
data generated. Information extraction from DGGE patterns
was first improved by clustering fingerprints, based on
correlation of densitometric curves. Furthermore, the areas of
bands along fingerprints were used to derive numerical
descriptors of DGGE patterns, such as the
Shannon diversity index. More recently, multivariate
analyses were applied to DGGE data, presenting results in 
ordination plots (11, 64). A detailed review of applications of 
multivariate analysis in microbial ecology was recently 
published (50). 
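
As a minimal sketch of how band areas can be turned into a number, the Shannon index H' = -sum(p_i ln p_i) can be computed for one lane, where p_i is the share of band i in the total band area of that lane. Treating each band as one taxon is a simplifying assumption, and the band areas below are invented for illustration.

```python
import math


def shannon_index(band_areas):
    """Shannon diversity H' computed from the band areas of one DGGE lane."""
    total = sum(band_areas)
    proportions = [area / total for area in band_areas if area > 0]
    return -sum(p * math.log(p) for p in proportions)


# Hypothetical lanes: an even community and one dominated by a single band.
even_lane = [25.0, 25.0, 25.0, 25.0]
dominated_lane = [85.0, 5.0, 5.0, 5.0]

print(shannon_index(even_lane))
print(shannon_index(dominated_lane))
```

For a given number of bands the index is highest when all bands have the same area, so the dominated lane scores lower than the even lane although both contain four bands.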



Multivariate analysis can be defined as data processing
that combines different measurements from the same sample
to infer correlations and interactions of factors.
Although a simple clustering analysis based on different 
parameters is a multivariate analysis, the most common 
presentations of these techniques are ordination methods. 
Ordination is the collective term for multivariate techniques 
that arrange sites along axes on the basis of data on species 
composition, resulting in a diagram in which sites are 
represented by points in a two-dimensional plot (20, 74). 

Focusing on microbial ecology related to plants, the main
advantage of multivariate analysis is the possibility of
including in a single analysis all factors which may influence
the composition of bacterial communities. This approach was
first applied to bacterial communities from soils (72) and
later to the analysis of the effects of historical land use and
plant species cultivation on the structure of the Burkholderia
spp. community (64). Similarly, using multivariate analysis, it
was demonstrated that the rhizosphere environment
determines the selection of Pseudomonas spp. that interact
with plants to a greater extent than the site of plant
cultivation (11).

A series of steps to apply multivariate analysis is 
presented in Figure 2. Briefly, gels with fingerprints 
generated from DGGE analysis are converted into matrices 
where presence and absence of each band is considered in 
each sample. Alternatively, the normalized intensity of the 
band can be used as an inference of the frequency of species 
in samples. These matrices can be first used for species-based
analyses like Principal Components Analysis (PCA) or
Correspondence Analysis (CA). These analyses are more
exploratory and trace correlations between samples, not
considering environmental factors. In this respect, these
plots are similar to the commonly presented trees generated
by clustering analysis. The decision of whether CA or PCA

Figure 2. Steps for multivariate analysis using data from molecular microbiological methods. The sequence is: obtaining
the fingerprint (a), conversion into matrices with qualitative or quantitative data and combination with data for sample
classification (b), gradient analysis (c) and application of the best fitted mathematical model (d).






Andreote, F.D. et al. 



should be used is made on the basis of the gradient size of the
species distribution. The gradient size is commonly estimated
by Detrended Correspondence Analysis (DCA), where the
gradient size of the first axis is considered. It is assumed that
values > 4.0 indicate a unimodal (normal-like) distribution of
data, and values < 3.0 indicate that a mathematical model
based on linear distribution would be better. For intermediate
values both methods can be applied, and normally the one
which better represents factors is presented in articles.
Species-based techniques often use PCA for linear data and
CA for unimodally distributed data (Figure 2).
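
The decision rule above can be written as a small helper. This is only a sketch of the thresholds stated in the text (4.0 and 3.0 for the first DCA axis); the function name and return format are illustrative, the label "unimodal" is used for the non-linear case, and the optional flag adds the species-environment counterparts (RDA and CCA) that the review pairs with the same two distributions.

```python
def choose_ordination(gradient_length, with_environment=False):
    """Suggest ordination methods from the DCA first-axis gradient length."""
    if gradient_length > 4.0:
        response = "unimodal"
    elif gradient_length < 3.0:
        response = "linear"
    else:
        # Intermediate values: the text allows both models.
        response = "either"

    methods = {
        ("linear", False): ["PCA"],
        ("linear", True): ["RDA"],
        ("unimodal", False): ["CA"],
        ("unimodal", True): ["CCA"],
        ("either", False): ["PCA", "CA"],
        ("either", True): ["RDA", "CCA"],
    }
    return response, methods[(response, with_environment)]


print(choose_ordination(2.1))
print(choose_ordination(4.6, with_environment=True))
print(choose_ordination(3.5))
```

The gradient length itself has to come from a DCA run in an ordination package; computing it is not attempted here.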

In addition, the correlation between species occurrence 
and environmental data from samples can be analyzed. These 
environmental data can be nominal, characterized by 
classification of samples with qualitative information; or 
quantitative, when parameters are measured in samples or in 
their place of collection. Examples of nominal variables are: 
whether the sample is obtained from transgenic plants or the 
location where samples were collected. Quantitative variables 
are numerically determined values, such as temperature or 
pH. 

Multivariate analysis is performed by combining species
and environmental data in a species-environment analysis.
The distribution of the species data, determined by DCA,
indicates the best model: linear data indicate Redundancy
Analysis (RDA), and unimodal (normal-like) data indicate
Canonical Correspondence Analysis (CCA). Then, to infer
the significance of each environmental factor for the
composition of species in samples, a Monte Carlo
permutation test is used, generating P values for each
environmental factor considered.
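
The logic of a Monte Carlo permutation test can be sketched in a simplified form. In RDA or CCA the test statistic is derived from the constrained ordination itself; the sketch below swaps in a much simpler statistic, the squared correlation between one environmental variable and the sample scores on one ordination axis, purely to show how the P value arises from reshuffling. All names and data are hypothetical.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation between two equally long lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def permutation_p_value(env_values, axis_scores, n_permutations=999, seed=0):
    """P value: share of random reshufflings at least as extreme as observed."""
    rng = random.Random(seed)
    observed = r_squared(env_values, axis_scores)
    shuffled = list(env_values)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break the link between samples and variable
        if r_squared(shuffled, axis_scores) >= observed - 1e-12:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical data: pH of eight samples and their scores on the first axis.
ph = [4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0]
axis1 = [-1.4, -1.0, -0.6, -0.2, 0.2, 0.6, 1.0, 1.4]

p_value = permutation_p_value(ph, axis1)
print(p_value)
```

With 999 permutations the smallest attainable P value is 0.001, so the number of permutations sets the resolution of the test.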

Another concern about multivariate analysis is the
interpretation of plots and values for either RDA or CCA
analyses. Briefly, samples and environmental variables are
distributed over the axes, revealing separation in different
quadrants. In the two-dimensional plot, the main source of
variation is represented on the x-axis, so the main differences
appear as horizontal separation, while the second source of
variation is plotted on the y-axis. The farther samples lie from
the center of the plot and the longer the vectors are, the
greater the separation of samples or the importance of that
factor for the composition of bacterial communities. Also,
vectors pointing in the same direction represent variables
which respond similarly to variations, and the samples toward
which a vector points are those where the factor is more
intense. However, nominal variables, represented by
qualitative data, cannot be represented by vectors, and
become centroids in plots. Samples related to these variables
are commonly distributed around the centroid named after the
variable.

For better understanding, we present an example where a 
DGGE gel (Figure 2a) is converted into a matrix of relative 
band surface and then subjected to multivariate analysis,
correlating bands with two quantitative (pH and temperature)
and two qualitative (transgenic or wild type) environmental
variables (Figure 2b). After measurement of the gradient
distribution of data (Figure 2c), multivariate analysis
produced a plot (Figure 2d). If environmental variables are
not considered, PCA or CA only shows the correlation
between samples; if RDA or CCA analyses are made,
information about environmental factors is obtained. In this
case, pH and plant transgenic status were revealed to be
determinants of the composition of species in such
communities (vector and centroids related to the x-axis), 
while there were secondary effects due to variation in 
temperature (y-axis related vector) (Figure 2d). 
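
The conversion of a gel into matrices can be illustrated with a minimal sketch. It assumes that band detection has already produced, for each lane, the relative position and raw area of every band; the input format, the function name and the numbers are hypothetical and are not taken from Figure 2.

```python
def band_matrices(lanes):
    """Build relative-surface and presence/absence rows from band areas.

    lanes: dict mapping sample name -> dict of band position -> raw area
    (hypothetical format chosen for this sketch).
    """
    positions = sorted({pos for bands in lanes.values() for pos in bands})
    relative = {}
    presence = {}
    for sample, bands in lanes.items():
        total = sum(bands.values())
        relative[sample] = [
            bands.get(pos, 0.0) / total if total else 0.0 for pos in positions
        ]
        presence[sample] = [1 if bands.get(pos, 0.0) > 0 else 0 for pos in positions]
    return positions, relative, presence


# Two hypothetical lanes with bands at relative migration positions.
gel = {
    "wild_type": {0.20: 30.0, 0.45: 70.0},
    "transgenic": {0.45: 50.0, 0.80: 50.0},
}

positions, relative, presence = band_matrices(gel)
print(positions)
print(relative)
print(presence)
```

Each row then describes one sample and each column one band position, which is the layout expected by ordination software; the environmental variables are kept in a second table with the same row order.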

Such observations hold for all ordination plots obtained
by multivariate analysis. At first the graphical
representation seems complicated, but with some practice it
clearly shows the relationships
between samples and variables.

Construction and analysis of clone libraries 

The construction and analysis of ribosomal gene libraries 
is a highly sensitive tool to study environmental microbial 
ecology, allowing comparison of sequences from different 
samples, with resolution at different taxonomic levels. This 
technique is also a culture-independent approach that 
overcomes problems found when the culturable fraction of 
microbial communities is sampled (66). 

This technique was used to describe fungal community 
diversity in forest soils, by sequencing 863 clones of fungal 
Internal Transcribed Spacer, determining the presence of 412 
phylotypes composing this community (44). In a similar 
study, the bacterial community colonizing the leaf surfaces in 
Atlantic forest was assessed, and showed that every tree 
species harbored a different bacterial community, indicating 
that bacterial diversity in the forest had been underestimated 
(34). The application of pyrosequencing revealed >
10,000 bacterial phylotypes in an ocean community (71).
Clone libraries were used to describe bacteria on the roots of 
maize plants (8), determining qualitatively and quantitatively 
which species were present in this niche. 

The main advantage of this technique is the variety of 
analyses possible with the sequence groups. The libraries can 
be used for comparisons, ecological index estimates, 
rarefaction analyses and phylogenetic inferences on members 
of the determined community (Figure 2; 66, 69). Recently, 
the application of this technique has expanded to genes 
other than the 16S rRNA gene. The diversity of functional 
genes has also been investigated in this way, for example, the 
nitrite reductase gene (nirK) in water and plant-related 
environments (2, 60). 
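
As an illustration of the ecological index estimates and rarefaction analyses mentioned above, the following is a minimal sketch in Python. It assumes the clone library has already been clustered into phylotypes and summarized as a list of clone counts per phylotype; the counts shown are hypothetical and the function names are illustrative, not taken from any cited study. The Shannon index and the analytical rarefaction formula of Hurlbert are used here as common choices, not necessarily those applied in the works cited.

```python
from math import comb, log


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over phylotypes with clones."""
    total = sum(counts)
    return -sum((c / total) * log(c / total) for c in counts if c > 0)


def rarefied_richness(counts, n):
    """Expected number of phylotypes in a random subsample of n clones.

    Analytical rarefaction: each phylotype contributes the probability
    that at least one of its clones is drawn without replacement.
    """
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the library size")
    return sum(1 - comb(total - c, n) / comb(total, n) for c in counts if c > 0)


# Hypothetical library: number of clones assigned to each phylotype.
library = [40, 25, 10, 5, 5, 3, 1, 1]

print("Shannon index:", round(shannon_index(library), 3))
for n in (10, 30, 60, 90):
    print(n, "clones ->", round(rarefied_richness(library, n), 2), "phylotypes expected")
```

Plotting the expected richness against the subsample size gives the rarefaction curve; a curve that is still rising at the full library size suggests that more clones would reveal further phylotypes.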

High throughput approaches to assess the diversity of 
bacterial communities 

High throughput techniques, such as microarray-based 
profiling of bacterial communities (68, 80) and 
pyrosequencing (14), are now an important topic in any review 
of microbial ecology. They are very promising aids to 
understanding the communities and interactions of bacteria 
associated with plants. 

Microarrays are widely used for large-scale gene 
detection and quantification of gene expression in 
different organisms. Recently, this technique was adapted to 
the profiling of environmental communities. The main 
advantage of the microarray technique over other 
methodologies is the quantitative assessment of bacterial 
diversity. For a review of applications of this approach, see 
Zhou (80). The most recent version of microarray profiling is 
the GeoChip (26), consisting of 24,243 oligonucleotide probes 
(50 bases each) that cover > 10,000 genes spread over > 150 
functional bacterial groups involved in metal reduction and 
resistance, organic contaminant degradation, and nitrogen, 
carbon, sulfur and phosphorus cycling (26). 

Pyrosequencing has broken the barrier that limited the 
number of sequences in studies of bacterial diversity. With 
the ability to generate megabases of sequence in a few hours, 
it allows deep exploration of the species in any 
environmental sample (14). This technique differs from the 
Sanger methodology, which is based on the incorporation and 
subsequent detection of fluorescently labeled ddNTPs 
(dideoxynucleotide triphosphates). During pyrosequencing, 
only one dNTP (deoxynucleotide triphosphate) is available at 
a time, and incorporation of this nucleotide generates the 
signal detected by the equipment (38, 56, 57). When the 
offered nucleotide is complementary to the template, the base 
is incorporated and pyrophosphate is released; ATP 
sulfurylase converts the pyrophosphate into ATP, which 
supplies the energy for luciferase to convert luciferin into 
oxyluciferin, releasing the light signal. Because these 
reaction steps require extremely small reaction volumes, the 
process can be scaled up. In this way, the latest version of 
the instruments can produce around 300,000 reads of 
approximately 200-400 bp in about 5 h. 
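
The one-nucleotide-at-a-time principle described above can be sketched as a short Python simulation. This is an idealized model under stated assumptions: the flow order "TACG" and the function names are illustrative choices, the light signal is taken to be exactly proportional to the number of bases incorporated in a flow, and no noise or signal decay is modeled. Real instruments record noisy intensities that must be rounded to base counts.

```python
def flowgram(read, flow_order="TACG", max_flows=400):
    """Idealized signal per nucleotide flow for the strand being synthesized.

    Each flow offers a single dNTP. The signal is the number of consecutive
    bases incorporated in that flow (0 when the offered dNTP does not match
    the next base), so a homopolymer run yields one proportionally larger
    signal rather than several separate ones.
    """
    signals, pos, flow = [], 0, 0
    # max_flows bounds the loop if the read holds a base absent from flow_order.
    while pos < len(read) and flow < max_flows:
        base = flow_order[flow % len(flow_order)]
        run = 0
        while pos < len(read) and read[pos] == base:
            run += 1
            pos += 1
        signals.append((base, run))
        flow += 1
    return signals


def call_bases(signals):
    """Reconstruct the sequence from (base, signal) pairs."""
    return "".join(base * run for base, run in signals)


# Hypothetical short read used only for illustration.
example = "TTAGC"
flows = flowgram(example)
print(flows)
print(call_bases(flows))
```

The sketch also makes a known limitation visible: because a run of identical bases is read from a single signal, the length of long homopolymers depends on how precisely the light intensity can be measured.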

This technique has been used to describe bacterial 
communities in different environments. The deep ocean 
biosphere was described by pyrosequencing of samples 
collected at different depths (71), and soil bacterial 
communities (55) were similarly investigated. Both studies 
showed that the diversity of these organisms was extremely 
high: although 30,000 sequences were obtained, the 
description of the species and sequences in both environments 
remained incomplete. The authors showed that the great 
majority of species were described, but a 'rare biosphere 
tail' remained that was not completely explored, even with 
the large number of sequences. Although the microbial 
diversity associated with plants seems to be lower, 
particularly for endophytes, the future application of the 
technique in this field is promising. 
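
One way to quantify how much of a community remains in an unexplored 'rare tail' is sketched below in Python. Good's coverage and the bias-corrected Chao1 richness estimator are used as examples of widely applied measures; they are assumptions of this sketch, not necessarily the methods used in the cited studies, and the counts are hypothetical.

```python
def goods_coverage(counts):
    """Good's coverage: 1 - (singletons / total sequences)."""
    total = sum(counts)
    singletons = sum(1 for c in counts if c == 1)
    return 1 - singletons / total


def chao1(counts):
    """Bias-corrected Chao1 estimate of total richness.

    Uses the numbers of phylotypes seen exactly once (f1) and exactly
    twice (f2) to estimate how many phylotypes were missed.
    """
    observed = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical data set: sequences per phylotype, with a tail of rare ones.
sample = [50, 30, 10, 2, 2, 1, 1, 1, 1]

print("Observed phylotypes:", sum(1 for c in sample if c > 0))
print("Good's coverage:", round(goods_coverage(sample), 3))
print("Chao1 estimate:", chao1(sample))
```

A coverage close to 1 together with a Chao1 estimate well above the observed richness corresponds to the pattern described above: most sequences belong to phylotypes already seen, while many rare phylotypes remain undetected.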

Final considerations 

The characteristics of bacterial communities associated 
with plants make them an interesting area for research, 
promising to supply information on microbial ecology and 
environmental changes, and to act as a reservoir of species 
with potential use in agriculture and industry. The 
application of molecular microbiology techniques allows 
culture-independent approaches in the investigation of 
bacterial communities. Although the development of new 
techniques and insights allows better assessment and 
description of bacterial communities, the dynamics of these 
organisms are still assessed like photographs: taken at the 
moment of sampling, but not guaranteed to be the same five 
minutes later. 

SUMMARY 

ASSESSING THE DIVERSITY OF BACTERIAL COMMUNITIES 
ASSOCIATED WITH PLANTS 

Plant-bacteria interactions result from reciprocal 
recognition between both species. These interactions are 
responsible for biological processes essential to the 
development and protection of plants. This work reviews the 
methodologies applied to investigate shifts in the bacterial 
communities associated with plants. The techniques are 
described from isolation to the application of 
culture-independent techniques, with emphasis on qPCR, 
Denaturing Gradient Gel Electrophoresis (DGGE), the 
construction and analysis of clone libraries, the application 
of multivariate analysis to microbial ecology data, and the 
new high throughput methodologies such as microarrays and 
pyrosequencing. In summary, this review provides information 
on the development of traditional techniques and a general 
overview of the new trends in studies of bacterial 
communities associated with plants. 

Key words: plant-bacteria interaction, molecular 
techniques, multivariate analysis, endophytes, rhizosphere 